Abstract

Motivated  by  a  factory  scheduling  problem,  we  consider  the  problem  of  input  control 
(subject  to  a  specified  product  mix)  and  sequencing  in  a  two-station  multiclass  queueing 
network with general service time distributions and a general routing structure. The
objective is to minimize the long-run average expected number of customers in the system
subject  to  a  constraint  on  the  long-run  average  expected  output  rate.  Under  balanced 
heavy  loading  conditions,  this  scheduling  problem  is  approximated  by  a  control  problem 
involving  Brownian  motion.  A  reformulation  of  this  Brownian  control  problem  was  solved 
exactly in Wein [17]. In the present paper, this solution is interpreted in terms of the
queueing network model in order to obtain an effective scheduling rule. The resulting sequencing
rule  is  a  static  priority  ranking  of  the  classes.  The  input  policy  is  a  "workload  regulating" 
input  policy,  where  a  customer  is  injected  into  the  system  whenever  the  expected  total 
amount  of  work  in  the  system  for  the  two  stations  falls  within  a  prescribed  region.  An 
example  is  presented  that  illustrates  the  procedure  and  demonstrates  its  effectiveness. 

This research is motivated by a particular scheduling problem that is encountered
in many factories. By viewing a factory as a network of queues, the scheduling problem
can be formulated as one of controlling the flow in a queueing network. The queueing
network under consideration consists of two single-server stations and $K$ different customer
classes. Customers of class $k = 1, \ldots, K$ require service at a specific station $s(k)$, and their
service times are independent and identically distributed random variables with finite mean
$m_k$ and variance $s_k^2$. Upon completion of service, a class $k$ customer turns next into a
class $j$ customer with probability $P_{kj}$ and exits the system with probability $1 - \sum_{j=1}^{K} P_{kj}$,
independent of all previous history. We assume that the $K \times K$ Markovian switching
matrix $P = (P_{kj})$ has spectral radius less than one, so that all customers will eventually
exit the system. Because the number of classes is allowed to be arbitrary, this routing
structure is almost perfectly general.

The  scheduling  problem  incorporates  input  and  sequencing  decisions.  We  assume 
there  is  an  endless  line  of  customers  who  are  waiting  to  gain  entry  into  the  network. 
Each  customer  in  the  line  has  an  exogenously  specified  class  designation.  These  class 
designations are such that, over the long run, the proportion of class $k$ customers released
into the system is $q_k$, where $\sum_{k=1}^{K} q_k = 1$. The vector $q = (q_k)$ will be referred to
as the entering class mix. The input decisions are to choose a non-decreasing process
$N = \{N(t), t \ge 0\}$, where $N(t)$ is the cumulative number of customers injected into the
system up to time $t$. Thus the input decisions essentially allow full discretion over the
timing of the release of customers into the system, but do not allow for the choice of which
class of customer to inject.

The sequencing decisions consist of choosing, at each point in time, which class of
customer to process at each server in the network. Preemptive resume scheduling is allowed,
so  that  service  of  a  customer  may  be  interrupted  at  a  particular  station  when  a  higher 
priority  customer  arrives  at  that  station.  Due  to  the  rather  crude  nature  of  the  Brownian 
approximation  that  is  employed  here,  the  assumptions  made  regarding  preemption  do  not 
have  an  effect  on  the  scheduling  policy  that  emerges  from  the  analysis. 

It is assumed that a holding cost $c_k$ is incurred for each unit of time that a class $k$
customer spends in the queueing network. Also, there is a specified lower bound $\bar{\lambda}$ on the
long-run  average  expected  throughput  rate  of  the  queueing  network.  The  throughput  rate 
of  a  queueing  system  is  the  number  of  customer  departures  from  the  system  per  unit  of 
time.  Our  queueing  network  scheduling  problem  is  to  choose  the  input  and  sequencing 
decisions  so  as  to  minimize  the  long-run  average  expected  holding  costs  incurred  per  unit 
of  time,  subject  to  a  lower  bound  constraint  on  the  long-run  average  expected  throughput 
rate. Notice that in the special case where $c_k = c$ for all $k = 1, \ldots, K$, the objective is to
minimize the long-run average number of customers in the system. Because the problem is
formulated in terms of long-run averages and because the constraint on throughput will in
general be tight, Little's formula [9] implies that this objective is equivalent to minimizing
the  long-run  average  expected  cycle  time  of  customers  in  the  system.  The  cycle  time 
of  a  customer  is  the  amount  of  time  a  customer  spends  in  the  queueing  network.  In  a 
manufacturing  setting,  there  are  many  good  reasons  to  minimize  both  the  work-in-process 
inventory  and  the  cycle  time,  and  some  of  these  will  be  discussed  in  the  next  section. 

A  good  deal  of  literature  exists  on  input  control  of  queueing  networks,  but  these  models 
consider  the  decision  of  whether  to  accept  or  reject  Poisson  arrivals;  Stidham  [15]  provides 
a  thorough  survey  of  work  in  this  area.  Such  models  are  not  applicable  to  the  scheduling 
problem  considered  here,  since  the  relevant  issue  in  our  setting  is  when  to  release  a  customer 
into the queueing network, not whether or not to accept the customer. Although useful
results exist for sequencing single-station systems (see Klimov [8]), a satisfactory theory
for  sequencing  in  a  network  setting  has  not  been  attained,  and  simulation  (see  Conway, 
Maxwell  and  Miller  [2]  for  a  classic  study  on  this  topic)  is  still  the  primary  tool  of  analysis. 
In  view  of  the  difficulty  in  obtaining  sequencing  rules  for  conventional  multiclass  queueing 
networks  (it  has  been  14  years  since  Klimov's  result),  the  best  hope  for  further  progress 
appears  to  be  in  the  analysis  of  cruder,  more  tractable  models. 

One  such  model  is  a  Brownian  network,  a  stochastic  system  model  introduced  by 
Harrison [4]. Under conditions of balanced heavy loading, a Brownian network approximates
a multiclass queueing network with dynamic scheduling capability. To state these
conditions more precisely, let the two-vector $\rho = (\rho_i)$ be the relative server utilizations, or
traffic intensities, for the two stations. The values of $\rho_1$ and $\rho_2$ can be computed from the
switching matrix $P$, the vector $m = (m_k)$ of expected processing times, the entering class
mix $q = (q_k)$ and the specified average throughput rate $\bar{\lambda}$, as will be shown in Section 2.
The balanced heavy loading conditions assume the existence of a large integer $n$ such that
$0 \le \sqrt{n}(1 - \rho_i) \le 1$ for $i = 1, 2$. As a canonical example, one may think of $\rho_1 = \rho_2 = 0.9$,
in which case $n = 100$ satisfies this condition.
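
As a quick numerical illustration of the balanced heavy loading condition, the following Python sketch evaluates $\sqrt{n}(1 - \rho_i)$ for the canonical example above. The function name `heavy_loading_slack` is ours and is purely illustrative.

```python
import math

def heavy_loading_slack(rho, n):
    """Return sqrt(n) * (1 - rho_i) for each station's traffic intensity."""
    return [math.sqrt(n) * (1.0 - r) for r in rho]

# Canonical example from the text: rho_1 = rho_2 = 0.9 and n = 100.
slack = heavy_loading_slack([0.9, 0.9], 100)
print(slack)  # both entries are approximately 1
```

For this example the scaled slack at each station is approximately 1, which is the sense in which $n = 100$ is a natural scaling parameter when both stations run at 90% utilization.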

Under  such  conditions,  the  scheduling  problem  described  above  can  be  approximated 
by  a  dynamic  control  problem  for  a  Brownian  network.  The  state  of  the  system  in  this 
Brownian control problem is a $K$-dimensional vector queue length process (appropriately
scaled). Instead of analyzing the Brownian control problem directly, the problem is reformulated
in Wein [17] so that the state of the system is described by a two-dimensional
process  that  represents  the  scaled  version  of  the  total  amount  of  work  in  the  system  for 
each  of  the  two  stations.  The  reformulated  problem  is  solved  exactly  in  Wein  [17],  and  in 
the  present  paper,  the  solution  is  interpreted  in  terms  of  the  queueing  system  in  order  to 
obtain  an  effective  scheduling  rule  for  the  original  queueing  network  (and  hence  factory) 
scheduling  problem.  This  interpretation  is  based  on  intuition  obtained  from  existing  heavy 
traffic limit theorems for some simpler queueing systems, and no attempt is made to rigorously
justify our interpretation via a weak convergence result. However, we conjecture
that the resulting scheduling rule is asymptotically optimal in the heavy traffic limit (i.e.,
as $n \to \infty$).

The  scheduling  rule  derived  here  consists  of  a  sequencing  rule  and  an  input  policy.  To 
describe the rule, a few definitions are needed. Let $M_{ik}$ equal the expected total amount
of time that the server at station $i$ (hereafter referred to as server $i$) must devote to a class
$k$ customer before that customer eventually exits the network. Denote the $K$-dimensional
queue length process by $Q$, so that $Q_k(t)$ is the number of class $k$ customers in the system
at time $t$ for $k = 1, \ldots, K$. Defining a two-dimensional workload process $w = (w_i)$ by
$w(t) = MQ(t)$, where $M = (M_{ik})$, we interpret $w_i(t)$ as the expected total amount of
work for server $i$ embodied in those customers who are present anywhere in the network
at time $t$.
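
To make the workload process concrete, here is a minimal Python sketch that evaluates $w(t) = MQ(t)$ at a single time point. The matrix `M` and the queue length vector `Q` below are hypothetical numbers chosen for illustration; they are not data from the paper.

```python
def workload(M, Q):
    """Compute w = M Q: expected remaining work for each of the two servers."""
    return [sum(M_i[k] * Q[k] for k in range(len(Q))) for M_i in M]

# Hypothetical data: 2 stations, 3 customer classes.
M = [[4.0, 3.0, 3.0],   # M[0][k]: expected work for server 1 per class-(k+1) customer
     [2.0, 2.0, 0.0]]   # M[1][k]: expected work for server 2 per class-(k+1) customer
Q = [2, 1, 3]           # number of customers of each class present at time t

w = workload(M, Q)
print(w)  # [20.0, 6.0]
```

Here server 1 faces 20 time units of expected work and server 2 faces 6, counting work embodied in customers anywhere in the network and not only those currently queued at the station.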

Recalling that $c_k$ is the linear holding cost for a class $k$ customer, the sequencing
rule ranks each customer class $k$ by the index $c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})$. In the special case
where $c_k = c$ for all $k = 1, \ldots, K$, this rule is a static priority ranking that awards higher
priority at station 1 (respectively, station 2) to the smaller (respectively, larger) values of
this index. (The case where the $c_k$ are not all equal will be discussed in Section 5.) It
is  interesting  to  note  that,  as  in  Klimov's  results  for  a  single-station  queueing  system,  the 
solution  to  a  dynamic  scheduling  problem  is  a  static  priority  ranking  of  the  classes,  and 
the  solution  depends  on  the  general  service  time  distributions  only  through  their  means. 
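
The ranking can be sketched in a few lines of Python. This sketch assumes equal holding costs ($c_k = c$), so that only the quantity $\rho_2 M_{1k} - \rho_1 M_{2k}$ matters for the ordering; the function name `static_priorities` and all numerical data are hypothetical.

```python
def static_priorities(M, rho, station):
    """Rank classes at each station by the index rho_2*M_1k - rho_1*M_2k.

    Assumes equal holding costs (c_k = c), the case in which the text
    describes the rule as a static priority ranking.
    """
    K = len(station)
    index = [rho[1] * M[0][k] - rho[0] * M[1][k] for k in range(K)]
    # Station 1 serves smaller index values first; station 2 serves larger first.
    order1 = sorted((k for k in range(K) if station[k] == 1), key=lambda k: index[k])
    order2 = sorted((k for k in range(K) if station[k] == 2), key=lambda k: -index[k])
    # Report classes with 1-based labels, highest priority first.
    return [k + 1 for k in order1], [k + 1 for k in order2]

# Hypothetical data: four classes visited in the order 1 -> 2 -> 3 -> 4,
# alternating between station 1 and station 2.
M = [[4.0, 3.0, 3.0, 0.0],
     [4.0, 4.0, 2.0, 2.0]]
rho = [0.9, 0.9]
station = [1, 2, 1, 2]

order1, order2 = static_priorities(M, rho, station)
print(order1, order2)  # [1, 3] [2, 4]
```

In this example station 1 serves class 1 ahead of class 3, and station 2 serves class 2 ahead of class 4: at each station the class with relatively more remaining work at the other station is dispatched first.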

This sequencing rule has the following interpretation. In the special case when $c_k = c$
for all $k = 1, \ldots, K$ and $\rho_1 = \rho_2$ (i.e., minimization of the cycle time in a perfectly balanced
system), the rule tends to retain jobs at each station (by giving them lower priority) that
have relatively more work to be done at that station, either now or later, dispatching more
quickly (by giving them higher priority) jobs that have relatively more work to be done at
the other station. When $\rho_1 \neq \rho_2$, then $\rho_1$ and $\rho_2$ show up as appropriate weighting factors.
Incidentally, it is known (Harrison and Wein [5]) that when $c_k = c$ for all $k = 1, \ldots, K$,
this same sequencing rule maximizes the throughput rate in a two-station multiclass closed
queueing network in heavy traffic.

The  input  rule  is  called  a  workload  regulating  policy  because  it  depends  solely  on  the 
two-dimensional  workload  process  w.  More  specifically,  the  rule  releases  a  new  customer 
into  the  system  whenever  the  workload  process  enters  a  certain  region  in  the  nonnegative 
orthant of $\mathbb{R}^2$. A description of this region is fairly involved and will be deferred until
Section  6,  where  the  region  is  calculated  explicitly.  For  a  typical  example,  interested  readers 
may  refer  to  Figure  2  of  Section  7,  where  the  region  consists  of  the  shaded  area.  The  input 
rule  causes  the  network  to  behave  as  a  "pull"  system:  when  either  server  appears  to  be 
threatened  with  idleness  and  there  is  not  too  much  work  already  present  in  the  system,  a 
new  customer  is  released  into  the  system. 

Although  neither  the  sequencing  nor  input  rule  derived  here  has  ever  appeared  in  the 
literature,  they  are  both  intuitively  appealing  policies.  Furthermore,  in  a  manufacturing 
setting,  they  would  be  very  easy  to  implement.  As  will  be  seen  in  Section  7,  these  policies 
outperform  conventional  scheduling  rules  in  simulation  studies. 

The original system description of a two-station, heavily-loaded, well-balanced network
may seem quite restrictive at first glance. However, one important implication of
the  balanced  heavy  loading  assumption  is  that,  in  the  heavy  traffic  limit  represented  by 
the  Brownian  network  model,  any  stations  in  the  original  system  that  are  not  among  the 
most  heavily  loaded  will  simply  disappear.  This  has  been  proven  in  limit  theorems  by 
Johnson [7] and Chen and Mandelbaum [1] in the single-type open queueing network setting.
Limit theorems of this type can justify the procedure of eliminating all stations that
are  not  heavily  loaded  when  forming  the  approximating  Brownian  network,  reducing  the 
original  system  to  a  subnetwork  of  bottleneck  stations  for  purposes  of  subsequent  analysis. 
However,  these  bottleneck  stations  are  precisely  where  the  large  queues  form,  where  most 
of  the  waiting  is  incurred,  and  thus  where  scheduling  will  have  the  biggest  impact.  In 
fact, other approaches to job shop scheduling problems, such as the OPT system (see
Jacobs [6] for a critical evaluation of its main features) or the expert systems approach
taken  by  Morton  and  Smunt  [10],  also  focus  on  the  bottleneck  stations.  Thus,  although  the 
Brownian  network  approximation  is  a  rather  crude  model  in  comparison  to  a  conventional 
queueing  network,  its  underlying  assumptions  are  made-to-order  for  scheduling  purposes. 

One  consequence  of  the  previous  paragraph  is  that  the  scheduling  rule  emerging  from 
our  analysis  can  be  applied  to  any  queueing  network  with  two  bottleneck  stations.  In  fact, 
a simulation model has been built that is based on operating data from an actual semiconductor
wafer fabrication facility. Using this simulation model (see Wein [16] for details),
which contains 24 stations but only two bottleneck stations, rules similar to the ones derived
here were compared against conventional sequencing and input rules. The results
were  quite  impressive:  the  rules  outperformed  conventional  rules  and  achieved  a  47.2% 
reduction  in  average  customer  queueing  time  versus  the  base  case  of  Poisson  inputs  and 
first-in  first-out  (FIFO)  sequencing. 

This  paper  is  organized  as  follows.  The  factory  scheduling  problem  that  motivates  our 
study  is  discussed  in  Section  1.  In  Section  2  the  Brownian  approximation  of  the  queueing 
network  scheduling  problem  is  stated.  The  Brownian  control  problem  is  reformulated  in 
Section  3  and  the  solution  to  the  reformulated  problem,  which  was  derived  in  Wein  [17],  is 
stated  in  Section  4.  This  solution  is  interpreted  in  terms  of  the  original  queueing  system 
in  Sections  5  and  6,  in  order  to  obtain  a  sequencing  rule  and  an  input  policy,  respectively. 
An  example  is  presented  in  Section  7  that  illustrates  the  procedure  and  demonstrates  its 
effectiveness. 

Some  of  the  notational  conventions  and  terminology  used  in  this  paper  will  now  be 
introduced. A stochastic process is said to be RCLL if its sample paths are right continuous
and have left limits with probability one. When we say that $X$ is a $(\mu, \sigma^2)$ Brownian
motion, it is assumed there is a given $(\Omega, \mathcal{F}, \mathcal{F}_t, X, P_x)$, where $(\Omega, \mathcal{F})$ is a measurable
space, $X = X(\omega)$ is a measurable mapping of $\Omega$ into $C(\mathbb{R})$, which is the space of continuous
functions on the real line $\mathbb{R}$, $\mathcal{F}_t = \sigma(X(s), s \le t)$ is the filtration generated by $X$, and $P_x$
is a family of probability measures on $\Omega$ such that the process $\{X(t), t \ge 0\}$ is a Brownian
motion with drift $\mu$, variance $\sigma^2$ and initial state $x$. Let $E_x$ be the expectation operator
associated with $P_x$. If $Y = \{Y(t), t \ge 0\}$ is a process such that $Y(t)$ is $\mathcal{F}_t$-measurable for all $t \ge 0$,
then we say that the process $Y$ is non-anticipating with respect to the Brownian motion
$X$. More generally, we will say that one process $Y$ is non-anticipating with respect to
another process $X$ when $Y$ is adapted to the coarsest filtration with respect to which $X$ is
adapted.

1.     The  Factory  Scheduling  Problem 

This section describes the relationship between the queueing network scheduling problem
and the factory scheduling problem. Each server in the queueing network corresponds
to a machine or work center in the factory, and each customer corresponds to a particular
job. The routing structure described in the introduction can accommodate the case where
the  factory  produces  a  variety  of  products,  each  with  its  own  arbitrary  deterministic  route 
through  the  network  of  machines.  In  that  case,  a  different  customer  class  is  defined  for 
each  combination  of  product  and  stage  of  completion.  More  generally,  our  set-up  allows 
probabilistic  routing  to  represent  such  events  as  rework  or  scrapping.  In  fact,  a  customer 
class  can  include  any  observable  information  about  a  particular  job  that  is  relevant  for 
dynamic  scheduling  purposes. 

The queueing network model can also accommodate machine breakdown and repair.
By assuming that the amount of machine busy time between consecutive breakdowns is
exponentially distributed, the breakdown and repair can be incorporated into the service
time distributions for each customer class; see Harrison [4] for details. The modified $m_k$
and $s_k^2$ are interpreted as the mean and variance of the effective service time of a class $k$
customer, i.e., the actual processing time plus the total duration of all interruptions that
occur during that service.

In  the  manufacturing  setting,  the  sequencing  decisions  consist  of  dynamically  choosing 
which  job  to  process  at  each  machine  in  the  factory;  this  corresponds  to  the  classic  job 
shop  scheduling  problem.  The  input  decisions  in  our  problem  specify  the  timing  of  the 
release  of  jobs  onto  the  factory  floor.  However,  it  is  assumed  that  the  exact  sequence  of 
entering  product  types  is  specified  precisely.  This  sequence  reflects  the  desired  product 
mix  that  the  factory  is  required  to  maintain.  For  example,  if  a  factory  makes  two  products, 
A  and  B,  to  be  produced  in  equal  quantities,  then  the  specified  sequence  of  entering  jobs 
would be ABABAB.... There is a long-run average output rate (in jobs per unit time) that
the factory is required to maintain. When the holding costs $c_k = c$ for all $k = 1, \ldots, K$, the
objective  is  to  minimize  the  long-run  average  expected  work-in-process  (WIP)  inventory, 
which  is  equivalent  to  minimizing  the  long-run  average  expected  cycle  time  of  jobs. 

This  scheduling  problem  is  relevant  for  any  factory  that  is  obliged  to  maintain  a 
specified  average  output  rate  of  a  certain  product  mix,  but  can  control  the  timing  of  its 
inputs. In thinking about endogenously generated arrivals, it is easiest to imagine a make-to-stock
manufacturer, where orders are met from finished goods inventory. However, in
a  make-to-order  environment,  input  to  the  factory  floor  can  also  be  regulated,  but  then 
customer  orders  will  sometimes  queue  outside  the  factory  floor  waiting  to  gain  entrance. 
The  motivation  for  doing  this  is  to  reap  the  benefits  that  can  be  gained  by  a  reduction 
in  both  the  WIP  inventory  on  the  factory  floor  and  the  cycle  time  of  jobs  on  the  factory 
floor. By reducing the number of jobs on the factory floor, the benefits from Just-In-Time
manufacturing (see Schonberger [13] for a detailed description) can be realized. For
example, quality problems will be detected faster, and thus there will be less rework and
scrap of jobs. By reducing the cycle time of jobs, the factory can gain flexibility: the system
will  be  more  capable  of  very  fast  turnaround  on  individual  orders,  and  the  factory  may 
more  readily  adapt  to  a  changed  order,  since  the  corresponding  job  may  not  have  begun 
its processing. A more specific example occurs in the semiconductor industry, where a
decrease in the average cycle time of a lot of wafers in the wafer fab will result in an
increase  in  the  yield  of  good  wafers.  This  is  because  lots  are  so  easily  contaminated  while 
in  the  fab.  Finally,  in  the  case  of  standardized  products  that  can  be  made  to  stock,  shorter 
cycle  times  allow  production  to  be  based  on  more  accurate  forecasts  of  market  demand. 

Since  our  definition  of  cycle  time  does  not  include  the  time  that  transpires  between 
receiving  an  individual  order  and  releasing  the  corresponding  job  onto  the  factory  floor, 
readers  may  be  concerned  about  the  effect  the  rules  derived  here  would  have  on  due-date 
performance.  Our  view  is  that,  in  the  case  of  a  busy  factory  with  more  than  one  bottleneck 
machine,  scheduling  for  due-dates  has  a  detrimental  effect  on  the  utilization  of  bottleneck 
machines,  and  hence  ultimately  does  more  harm  than  good.  As  an  example,  consider  a 
two-station  well-balanced  factory  that  has  a  very  large  backlog  of  jobs,  each  with  a  given 
due  date.  Furthermore,  suppose  that  either  workload  regulating  input  (see  Section  6) 
or  closed  loop  input  (the  total  number  of  jobs  on  the  factory  floor  is  held  constant,  see 
Solberg  [14])  is  used.  In  these  cases,  the  sequencing  rule  described  here  can  substantially 
increase  server  utilization  compared  to  any  sequencing  rule  that  sequences  according  to 
due-date  information  (see  Harrison  and  Wein  [5]  or  Wein  [17]).  This  sequencing  rule  will 
allow  the  factory  to  produce  more  jobs  per  unit  time  and  thus  would  eventually  provide 
more  timely  customer  service  than  a  myopic  sequencing  rule  that  is  based  on  due-dates. 
(Notice  that  the  above  argument  does  not  hold  for  a  factory  with  only  a  single  machine. 
This  is  because,  in  a  single-server  queueing  system,  every  work-conserving  sequencing  rule 
achieves the same server utilization.)

In  summary,  the  research  undertaken  here  attempts  to  realistically  incorporate  the 
dynamic  and  stochastic  elements  that  are  inherent  in  all  factory  scheduling  problems. 
Furthermore,  we  believe  that  factories,  by  focusing  on  system  performance  measures  (such 
as  WIP  inventory  and  cycle  time)  rather  than  due-date  performance  measures,  can  take 
advantage  of  some  benefits  of  Just-In-Time  manufacturing  and  can  provide  better  customer 
service  over  the  long  run. 

2.     The  Limiting  Control  Problem

We  assume  readers  are  familiar  with  the  approximating  Brownian  network  model  put 
forth in Harrison [4]; most of that paper's notation will be retained for ease of reference. It
follows  from  Section  9  of  Harrison  [4]  that  under  the  balanced  heavy  loading  assumptions, 
the queueing network scheduling problem described in the introduction can be approximated
by the following limiting control problem: choose a pair of RCLL processes $Y$ and
$\theta$ ($K$-dimensional and one-dimensional, respectively) to

$$\text{minimize} \quad \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T \sum_{k=1}^{K} c_k Z_k(t)\,dt\right] \tag{2.1}$$

$$\text{subject to} \quad Y \text{ and } \theta \text{ are non-anticipating with respect to } X, \tag{2.2}$$

$$Z(t) = X(t) + RY(t) - q\theta(t) \quad \text{for all } t \ge 0, \tag{2.3}$$

$$U(t) = AY(t) \quad \text{for all } t \ge 0, \tag{2.4}$$

$$U \text{ is non-decreasing with } U(0) = 0, \tag{2.5}$$

$$Z(t) \ge 0 \quad \text{for all } t \ge 0, \text{ and} \tag{2.6}$$

$$\limsup_{T \to \infty} \frac{1}{T} E_x[U_i(T)] \le \gamma_i \quad \text{for } i = 1, 2. \tag{2.7}$$

The process $Z$ represents the $K$-dimensional scaled queue length process and describes
the state of the system. The $K$-dimensional process $Y$ represents the scaled centered
allocation process and the one-dimensional process $\theta$ represents the scaled centered input
process. These two control processes correspond to the sequencing and input decisions,
respectively. Interested readers are referred to Harrison [4] for an explicit definition of the
process $Y$, since the definition will not be needed here. As in Harrison [4], exactly the same
notation used for the scaled processes is used in defining the approximating Brownian
control problem. This is done in order to emphasize the queueing network interpretation
of the Brownian network model. The scaled process $\theta$ is defined by

$$\theta(t) = \frac{\bar{\lambda} n t - N(nt)}{\sqrt{n}}, \quad t \ge 0, \tag{2.8}$$

where, as mentioned earlier, $\bar{\lambda}$ is the specified average throughput rate, $N(t)$ is the cumulative
number of customers released into the network in $[0, t]$ and $n$ is the large integer
specified in the balanced heavy loading condition.
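
The scaling in (2.8) can be illustrated with a small Python sketch, reading the definition as $\theta(t) = (\bar{\lambda} n t - N(nt))/\sqrt{n}$. The release process `N` below is a hypothetical example (five customers released immediately, then one every two time units) and is not taken from the paper.

```python
import math

def scaled_input(N, lam_bar, n, t):
    """theta(t) = (lam_bar * n * t - N(n * t)) / sqrt(n)."""
    return (lam_bar * n * t - N(n * t)) / math.sqrt(n)

def N(u):
    """Hypothetical cumulative releases: 5 up front, then one every 2 time units."""
    return 5 + math.floor(u / 2)

theta = scaled_input(N, lam_bar=0.5, n=100, t=1.0)
print(theta)  # -0.5
```

A negative value of $\theta(t)$ means that more customers have been released by time $nt$ than the nominal rate $\bar{\lambda}$ would have supplied; by (2.3) this pushes the scaled queue lengths upward.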

The two-dimensional process $U$ represents the scaled cumulative idleness process for
the two stations. (For brevity's sake, processes such as $Z$, $Y$, $U$ and $\theta$ will often be referred
to without the adjective "scaled".) The $K \times K$ input-output matrix $R = (R_{kj})$ is defined
by

$$R_{kj} = m_j^{-1}(\delta_{jk} - P_{jk}), \tag{2.9}$$

where $\delta_{jk}$ denotes the Kronecker delta, meaning that $\delta_{jk} = 1$ if $j = k$ and $\delta_{jk} = 0$
otherwise. The $2 \times K$ resource consumption matrix $A = (A_{ik})$ is defined by

$$A_{ik} = \begin{cases} 1, & \text{if } i = s(k); \\ 0, & \text{otherwise.} \end{cases} \tag{2.10}$$

The $K$-dimensional process $X$ is a $(\delta, \Sigma)$ Brownian motion, but several definitions are
needed before stating the $K$-dimensional drift vector $\delta = (\delta_k)$ and the $K \times K$ covariance
matrix $\Sigma = (\Sigma_{jl})$. Let $\lambda = (\lambda_k)$ be defined by

$$\lambda = q\bar{\lambda}, \tag{2.11}$$

so that $\lambda_k$ represents the average number of class $k$ customers that must arrive to the
system per unit of time in order to satisfy the throughput rate constraint.

Since $P$ was assumed to be transient, it follows that $R$ is non-singular and there exists
a unique non-negative $K$-vector $\beta = (\beta_k)$ satisfying the flow balance equations

$$\lambda = R\beta. \tag{2.12}$$

Letting $C(i)$ be the set of all customer classes $k$ such that $s(k) = i$, define the two-vector
of traffic intensities $\rho = (\rho_i)$ by

$$\rho_i = \sum_{k \in C(i)} \beta_k. \tag{2.13}$$
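
The flow balance equations and traffic intensities can be computed numerically without inverting $R$. The Python sketch below uses the reading $R_{kj} = m_j^{-1}(\delta_{jk} - P_{jk})$ of (2.9), under which $\lambda = R\beta$ is equivalent to $\beta_k = m_k \nu_k$, where $\nu$ solves the traffic equations $\nu_k = \lambda_k + \sum_j P_{jk}\nu_j$. The fixed-point iteration, the function name, and the four-class example data are our own illustrative choices.

```python
def traffic_intensities(P, m, q, lam_bar, station, iters=1000):
    """Solve lambda = R beta by fixed-point iteration and sum beta by station."""
    K = len(m)
    lam = [q[k] * lam_bar for k in range(K)]      # external arrival rates, as in (2.11)
    nu = lam[:]                                   # total arrival rate into each class
    for _ in range(iters):                        # converges since P is transient
        nu = [lam[k] + sum(P[j][k] * nu[j] for j in range(K)) for k in range(K)]
    beta = [m[k] * nu[k] for k in range(K)]       # fraction of time devoted to class k
    rho = [sum(beta[k] for k in range(K) if station[k] == i) for i in (1, 2)]
    return beta, rho

# Hypothetical example: every job follows the route 1 -> 2 -> 3 -> 4 and then exits.
P = [[0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 0.0]]
m = [1.0, 2.0, 3.0, 2.0]       # mean service times
q = [1.0, 0.0, 0.0, 0.0]       # all jobs enter as class 1
station = [1, 2, 1, 2]         # s(k) for k = 1, ..., 4

beta, rho = traffic_intensities(P, m, q, lam_bar=0.225, station=station)
print(beta, rho)               # rho is approximately [0.9, 0.9]
```

With these numbers both stations are loaded to about 90%, so the example is balanced in the sense required by the heavy loading conditions.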

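Now define the $K$-vector $\alpha = (\alpha_k)$ by

$$\alpha_k = \frac{\beta_k}{\rho_i} \quad \text{for all } k \in C(i). \tag{2.14}$$

Then the drift $\delta$ and covariance $\Sigma$ of the Brownian motion $X$ are

$$\delta = \sqrt{n}(\lambda - R\alpha) \quad \text{and} \tag{2.15}$$

$$\Sigma_{jl} = \sum_{k=1}^{K} \left[\alpha_k m_k^{-1} P_{kj}(\delta_{jl} - P_{kl}) + \alpha_k m_k^{-1} s_k^2 R_{jk} R_{lk}\right]. \tag{2.16}$$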

Inequality (2.7), which expresses the throughput rate constraint in terms of the cumulative
server idleness process $U$, is the only relationship in the limiting control problem
that does not appear in the Brownian network formulation of [4]. The two-vector $\gamma = (\gamma_i)$
in (2.7) is defined by

$$\gamma_i = \sqrt{n}(1 - \rho_i). \tag{2.17}$$

To derive (2.7), let the $2 \times K$ matrix $M = (M_{ik})$ be defined by

$$M = AR^{-1}. \tag{2.18}$$

$M$ is called the workload profile matrix, and $M_{ik}$ is interpreted as the expected total
amount of time that server $i$ must devote to a class $k$ customer before that customer exits
the network. Define the two-dimensional vector $v = (v_i)$ by

$$v = Mq, \tag{2.19}$$

so that $v_i$ is interpreted as the expected total amount of time over the long run that server
$i$ spends on each customer. From (2.11)-(2.13) and (2.18)-(2.19), it follows that

$$\rho_i = v_i \bar{\lambda} \quad \text{for } i = 1, 2. \tag{2.20}$$

Inequality (2.7) follows from (2.17) and (2.20), since the long-run average throughput rate
is greater than or equal to $\bar{\lambda}$ if and only if the long-run average fraction of time that server
$i$ is idle is less than or equal to $1 - \rho_i$.
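
The quantities $M$, $v$, $\rho$ and $\gamma$ can be computed from the primitive data with a short Python sketch. Rather than inverting $R$, it uses the interpretation of $M_{ik}$ as expected remaining work, which gives the recursion $M_{ik} = m_k 1\{s(k) = i\} + \sum_j P_{kj} M_{ij}$; this recursion, the function name, and the example data are our own illustrative assumptions.

```python
import math

def workload_profile(P, m, station, iters=1000):
    """Compute the 2 x K workload profile matrix M by fixed-point iteration."""
    K = len(m)
    M = [[0.0] * K for _ in range(2)]
    for _ in range(iters):  # converges because P has spectral radius below one
        M = [[(m[k] if station[k] == i + 1 else 0.0)
              + sum(P[k][j] * M[i][j] for j in range(K))
              for k in range(K)] for i in range(2)]
    return M

# Hypothetical example: route 1 -> 2 -> 3 -> 4, alternating stations 1 and 2.
P = [[0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 0.0]]
m = [1.0, 2.0, 3.0, 2.0]
station = [1, 2, 1, 2]
q = [1.0, 0.0, 0.0, 0.0]
lam_bar, n = 0.225, 100

M = workload_profile(P, m, station)
v = [sum(M[i][k] * q[k] for k in range(4)) for i in range(2)]   # v = M q
rho = [v_i * lam_bar for v_i in v]                              # rho_i = v_i * lam_bar
gamma = [math.sqrt(n) * (1.0 - r) for r in rho]                 # gamma_i = sqrt(n)(1 - rho_i)
print(M, v, rho, gamma)
```

In this example each entering job brings 4 time units of expected work to each server, so a throughput of 0.225 jobs per unit time loads both servers to about 90% and yields $\gamma_i \approx 1$.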

3.     The  Workload  Formulation 

The state of the system in the limiting control problem is described by a $K$-dimensional
queue length process, by way of the basic system relationship (2.3). In this section the
limiting control problem is reformulated so that the state of the system is described by a
two-dimensional workload process. Recalling the definition (2.18) of the workload profile
matrix $M$, let us define the two-dimensional scaled workload process $W = (W_i)$ by

$$W(t) = MZ(t), \quad t \ge 0, \tag{3.1}$$

where $W_i(t)$ is interpreted as the expected total amount of work for server $i$ embodied
in those customers who are present anywhere in the network at time $t$. Define the
two-dimensional Brownian motion $B(t) = (B_i(t))$ by

$$B(t) = MX(t), \quad t \ge 0. \tag{3.2}$$

The process $B$ has drift $M\delta$ and covariance $M\Sigma M^T$. By (2.10), (2.12)-(2.15) and (2.17)-(2.18),
one can show that the two-dimensional drift vector $M\delta = -\gamma$.
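
For readers who want the intermediate steps, the drift identity can be checked as follows. This is our own sketch of the calculation, taking the drift in (2.15) to be $\delta = \sqrt{n}(\lambda - R\alpha)$ and using $MR = A$ from (2.18), $M\lambda = \bar{\lambda} M q = \bar{\lambda} v = \rho$ from (2.11), (2.19) and (2.20), and the definition of $\alpha$ in (2.14).

```latex
\begin{aligned}
M\delta &= \sqrt{n}\,(M\lambda - MR\alpha)
         = \sqrt{n}\,(\bar{\lambda} M q - A\alpha)
         = \sqrt{n}\,(\rho - A\alpha), \\
(A\alpha)_i &= \sum_{k \in C(i)} \alpha_k
             = \frac{1}{\rho_i} \sum_{k \in C(i)} \beta_k = 1, \\
(M\delta)_i &= \sqrt{n}\,(\rho_i - 1) = -\gamma_i, \qquad i = 1, 2.
\end{aligned}
```

The second line uses (2.10) and (2.13): row $i$ of $A$ picks out exactly the classes served at station $i$, and the $\beta_k$ for those classes sum to $\rho_i$.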

Define the workload formulation of the limiting control problem as choosing RCLL
processes $Z$, $U$ and $\theta$ ($K$-, two- and one-dimensional, respectively) so as to

$$\text{minimize} \quad \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T \sum_{k=1}^{K} c_k Z_k(t)\,dt\right] \tag{3.3}$$

$$\text{subject to} \quad U \text{ and } \theta \text{ are non-anticipating with respect to } B, \tag{3.4}$$

$$U \text{ is non-decreasing with } U(0) = 0, \tag{3.5}$$

$$Z(t) \ge 0 \quad \text{for all } t \ge 0, \tag{3.6}$$

$$\limsup_{T \to \infty} \frac{1}{T} E_x[U_i(T)] \le \gamma_i \quad \text{for } i = 1, 2, \text{ and} \tag{3.7}$$

$$MZ(t) = B(t) + U(t) - v\theta(t) \quad \text{for all } t \ge 0. \tag{3.8}$$


Let us call a pair of RCLL processes $(Y, \theta)$ a feasible policy for the limiting control
problem if it satisfies equations (2.3)-(2.7) and call a triple of RCLL processes $(Z, U, \theta)$
a feasible policy for the workload formulation if it satisfies equations (3.5)-(3.8). The
following proposition, which was proved in Wein [17], allows us to analyze the workload
formulation of the limiting control problem, rather than studying problem (2.1)-(2.7)
directly.

Proposition 2.1. Every feasible policy $(Y, \theta)$ for the limiting control problem yields
a corresponding feasible policy $(Z, U, \theta)$ for the workload formulation and every feasible
policy $(Z, U, \theta)$ yields a corresponding feasible policy $(Y, \theta)$.

It was shown in Wein [17] that if the control process $Y$ is non-anticipating with respect
to the Brownian motion $X$ in the limiting control problem, then the control process $U$ is
non-anticipating with respect to the Brownian motion $B$ in the workload formulation. It
was also shown that the solution to the workload formulation remains unchanged whether
$\theta$ is non-anticipating with respect to $X$ or with respect to $B$.

4.     Solution  to  the  Workload  Formulation 

The solution $(U, Z, \theta)$ to the workload formulation (3.3)-(3.8) of the limiting control
problem was derived in Wein [17]. This is a self-contained section that summarizes the
solution. The parameters $\rho_i$, $M_{ik}$ and $v_i$ appearing in this section are all defined in terms
of the primitive problem data by definitions (2.13), (2.18) and (2.19), respectively. We also
need to define the parameters $\sigma^2$, $h_1$, $h_2$, $\nu$ and $\eta$, which can be calculated in terms of the
primitive problem data. The parameter $\sigma^2$ is defined by $\sigma^2 = g^T M \Sigma M^T g$, where

$g = \begin{pmatrix} \rho_2 \\ -\rho_1 \end{pmatrix}.$  (4.1)

Without loss of generality, assume that the classes $k = 1,\ldots,K$ are ordered so that

$\arg\max_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k}) = 1$  (4.2)

and

$\arg\min_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k}) = 2.$  (4.3)

Now define the positive coefficients $h_1$ and $h_2$ by

$h_1 = \frac{c_2}{\rho_1 M_{22} - \rho_2 M_{12}}$  (4.4)

and

$h_2 = \frac{c_1}{\rho_2 M_{11} - \rho_1 M_{21}}.$  (4.5)

Finally, let

$\nu = \frac{2\sqrt{n}\,(\rho_1 - \rho_2)}{\sigma^2}$  (4.6)

and

$\eta = \sqrt{n}\,\rho_1(1 - \rho_1).$  (4.7)

In the workload formulation, the controller observes a two-dimensional Brownian motion
process $B$, from which can be observed the one-dimensional Brownian motion process
$\hat{B}$ defined by

$\hat{B}(t) = \rho_2 B_1(t) - \rho_1 B_2(t), \quad t \ge 0.$  (4.8)

If $\rho_1 \ne \rho_2$, then define the interval endpoints $a$ and $b$ by

$a = \nu^{-1} \ln\left[\frac{(h_1 + h_2)\,\rho_2(1 - \rho_1)}{h_1\rho_2(1 - \rho_1) + h_2\rho_1(1 - \rho_2)}\right]$  (4.9)

and

$b = \nu^{-1} \ln\left[\frac{(h_1 + h_2)\,\rho_1(1 - \rho_2)}{h_1\rho_2(1 - \rho_1) + h_2\rho_1(1 - \rho_2)}\right].$  (4.10)

If $\rho_1 = \rho_2$, then let

$a = -\frac{h_2}{h_1 + h_2}\,\frac{\sigma^2}{2\eta}$  (4.11)

and

$b = \frac{h_1}{h_1 + h_2}\,\frac{\sigma^2}{2\eta}.$  (4.12)

For a particular realization of $\hat{B}$, define the control functionals $(R, L)$ by

$R(t) = \sup_{0 \le s \le t} \left[a - \hat{B}(s) + L(s)\right]^+$  (4.13)

and

$L(t) = \sup_{0 \le s \le t} \left[\hat{B}(s) + R(s) - b\right]^+.$  (4.14)

The two-dimensional optimal control process $U$ is given by

$U_1(t) = \frac{R(t)}{\rho_2}$  (4.15)

and

$U_2(t) = \frac{L(t)}{\rho_1}.$  (4.16)

From the functionals $(R, L)$ in (4.13)-(4.14), next define the process $\hat{W}$ by

$\hat{W}(t) = \hat{B}(t) + R(t) - L(t)$  for all $t \ge 0$.  (4.17)

The $K$-dimensional optimal control process $Z$ is given by

$Z_k(t) = \begin{cases} \dfrac{\hat{W}(t)}{\rho_2 M_{11} - \rho_1 M_{21}}, & \text{if } k = 1 \text{ and } \hat{W}(t) \ge 0; \\ 0, & \text{if } k \ne 1 \text{ and } \hat{W}(t) \ge 0, \end{cases}$  (4.18)

and

$Z_k(t) = \begin{cases} \dfrac{\hat{W}(t)}{\rho_2 M_{12} - \rho_1 M_{22}}, & \text{if } k = 2 \text{ and } \hat{W}(t) < 0; \\ 0, & \text{if } k \ne 2 \text{ and } \hat{W}(t) < 0. \end{cases}$  (4.19)

Finally, the optimal control process $\theta$ is given by

$\theta(t) = v_1^{-1}\left[B_1(t) + \frac{R(t)}{\rho_2} - \sum_{k=1}^K M_{1k} Z_k(t)\right]$  for all $t \ge 0$.  (4.20)

Thus the solution $(U, Z, \theta)$ to the workload formulation (3.3)-(3.8) is given by equations
(4.15)-(4.16) and (4.18)-(4.20). In the next two sections, this solution will be interpreted
in terms of the queueing network model.
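The functionals $(R, L)$ in (4.13)-(4.14) act as a two-sided regulator that keeps the imbalance inside the interval $[a, b]$. The following is a minimal discrete-time sketch of that idea, not the construction used in Wein [17]: it assumes the path of $\hat{B}$ is supplied as a list of increments, that the regulated process starts inside $[a, b]$, and that pushing the process back to the nearest barrier at each step is an adequate approximation. The function names, the step size and the barrier values in the usage lines are all illustrative assumptions.

```python
import random


def two_sided_regulator(increments, a, b, start=0.0):
    """Approximate the regulated process W_hat = B_hat + R - L on a time grid.

    increments: successive increments of the driving path B_hat (assumed input).
    Returns a list of (w_hat, R, L) tuples, one per step.
    """
    if not a <= start <= b:
        raise ValueError("start must lie in [a, b]")
    w, r, l = start, 0.0, 0.0
    path = []
    for db in increments:
        w += db
        if w < a:      # lower barrier: R increases by the undershoot
            r += a - w
            w = a
        elif w > b:    # upper barrier: L increases by the overshoot
            l += w - b
            w = b
        path.append((w, r, l))
    return path


def brownian_increments(steps, dt, drift, variance, seed=0):
    """Gaussian increments with the given drift and variance per unit time."""
    rng = random.Random(seed)
    sd = (variance * dt) ** 0.5
    return [drift * dt + sd * rng.gauss(0.0, 1.0) for _ in range(steps)]


# Illustrative run with made-up barrier values.
demo = two_sided_regulator(brownian_increments(1000, 0.01, 0.0, 1.0), a=-1.0, b=0.5)
print(min(p[0] for p in demo), max(p[0] for p in demo))
```

In this sketch $R$ and $L$ only grow while the process sits at $a$ and $b$ respectively, which mirrors the interpretation of $U_1 = R/\rho_2$ and $U_2 = L/\rho_1$ as idleness incurred only at the barriers.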

5.     The  Sequencing  Rule 

In  this  section  we  describe  the  sequencing  rule,  which  is  based  on  the  control  process 
Z.  Consider  again  the  workload  formulation  (3.3)-(3.8)  of  the  limiting  control  problem. 
According to definition (3.1), we must have $W(t) = MZ(t)$. This means that at any time $t$,
the  scaled  queue  length  process  Z  can  be  any  nonnegative  vector  that  is  consistent  with  the 
present  scaled  workload  process  W.  Thus,  in  the  idealized  Brownian  approximation,  queues 
of  different  customer  classes  can  be  instantaneously  swapped  for  one  another,  as  long  as 
the  expected  work  content  remains  unchanged.  These  swaps,  which  can  be  interpreted  as 
the  reallocation  of  server  time  among  the  various  classes,  appear  to  occur  instantaneously 
because  we  are  observing  the  system  evolving  in  scaled  time. 

From the solution $Z$ in (4.18)-(4.19), it is seen that only two of the $K$ components of
$Z$ are ever positive. These two components correspond to the two customer classes that
are

$\arg\max_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})$  (5.1)

and

$\arg\min_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k}),$  (5.2)

which were denoted by classes 1 and 2, respectively, by conventions (4.2)-(4.3). Furthermore,
at each time $t$, only one customer class has a positive queue length. According to
formulas (4.18)-(4.19), class 1 customers have a positive queue length whenever the workload
imbalance $\hat{W}(t) > 0$, and class 2 customers have a positive queue length whenever
$\hat{W}(t) < 0$. In the case where $c_k = c$ for all $k = 1,\ldots,K$ (i.e., the objective is to minimize
the long-run average cycle time of customers), it is true that class 1 is served at station 1
and class 2 is served at station 2. This is interpreted to mean that whenever $\hat{W}(t) > 0$,
customers of class 1 are only served when there are no other customers present at station
1. Similarly, whenever $\hat{W}(t) < 0$, customers of class 2 are only served when there are no
other customers present at station 2.
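As a small illustration of (4.18)-(4.19), the sketch below maps a value of the workload imbalance to the queue length vector that has a single positive component. It assumes the workload profile matrix of the Section 7 example as we read it from (7.1), equal utilizations of .9, and zero-based column positions for the two extreme classes (A1 and B2 in that example); the function name and argument names are hypothetical.

```python
def queue_lengths_from_imbalance(w_hat, M, rho, k_max, k_min):
    """Return the vector Z of (4.18)-(4.19) for a given imbalance w_hat.

    k_max, k_min: column positions of the classes attaining the arg max and
    arg min of the index (called classes 1 and 2 in the text).
    """
    rho1, rho2 = rho
    z = [0.0] * len(M[0])
    k = k_max if w_hat >= 0 else k_min
    # Only one class holds the entire imbalance; every other class is empty.
    z[k] = w_hat / (rho2 * M[0][k] - rho1 * M[1][k])
    return z


# Workload profile matrix as read from (7.1); columns are A1, A2, B1, B2, B3, B4.
M = [[4, 0, 10, 2, 2, 0],
     [1, 1, 13, 13, 7, 7]]
z_pos = queue_lengths_from_imbalance(2.7, M, (0.9, 0.9), k_max=0, k_min=3)
z_neg = queue_lengths_from_imbalance(-9.9, M, (0.9, 0.9), k_max=0, k_min=3)
print(z_pos)
print(z_neg)
```

With a positive imbalance only the A1 entry is nonzero, and with a negative imbalance only the B2 entry is nonzero; in both cases the entry is positive because the sign of the denominator matches the sign of the imbalance.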

Under heavy traffic conditions, it does not matter in what order classes $2,\ldots,K$ are
served when $\hat{W}(t) > 0$, or in what order classes $1,3,\ldots,K$ are served when $\hat{W}(t) < 0$; it is
only required that the two servers be kept busy when there is work for them to do. There
are two reasons for this. The first reason, as will be seen in the next section, is that the
asymptotically optimal input rule prevents a large queue of customers from forming at
station 1 (respectively, station 2) when $\hat{W}(t) < 0$ (respectively, $\hat{W}(t) > 0$). Consequently,
in the scaled space of the Brownian limit, all customers at station 1 (respectively, station
2) vanish when $\hat{W}(t) < 0$ (respectively, $\hat{W}(t) > 0$). The second reason is that the
customer classes that are not given bottom priority will not see the queueing system in
a heavy traffic situation, and thus their scaled queue lengths will be negligible compared
to that of the bottom priority classes. This phenomenon of the normalized queue length
processes of high priority customers vanishing in the heavy traffic limit has been observed
in previous work. Whitt [18], Harrison [3], and Reiman [12] have obtained heavy traffic
limit theorems in a single station system, and Johnson [7] and Peterson [11] have obtained
similar results in a network setting. However, a formal limit theorem has yet to be proved
for our case of a general multiclass network with feedback.

To repeat, the interpretation of formulas (4.18)-(4.19) is to give class 1 customers
lowest priority at station 1 when $\hat{W}(t) > 0$ and give class 2 customers lowest priority at
station 2 when $\hat{W}(t) < 0$. Some ambiguity remains in specifying a
sequencing rule that emerges from the solution of the Brownian control problem. However,
from (4.2)-(4.3), when $c_k = c$ for all $k = 1,\ldots,K$, there is a natural ranking of the $K$
customer classes by the index $\rho_2 M_{1k} - \rho_1 M_{2k}$. We now propose two sequencing policies
that give class 1 (respectively, class 2) customers lowest priority at station 1 (respectively,
station 2) when $\hat{W}(t) > 0$ (respectively, $\hat{W}(t) < 0$). The first policy is a static priority
rule that awards higher priority at station 1 (respectively, station 2) to the classes with
the smaller (respectively, larger) values of the index $\rho_2 M_{1k} - \rho_1 M_{2k}$.

The second policy is obtained by computing dynamic reduced costs for each customer
class $k = 1,\ldots,K$. The reduced cost for a class $k$ customer at time $t$ can be interpreted as
the increase in the objective function of the linear program (originally stated as equations
(3.1)-(3.4) in Wein [17])

$\min_{Z(t),\,\theta(t)} \; \sum_{k=1}^K c_k Z_k(t)$  (5.3)

subject to  $\sum_{k=1}^K M_{1k} Z_k(t) + v_1\theta(t) = B_1(t) + U_1(t)$  (5.4)

            $\sum_{k=1}^K M_{2k} Z_k(t) + v_2\theta(t) = B_2(t) + U_2(t)$  (5.5)

            $Z_k(t) \ge 0$, for $k = 1,\ldots,K$  (5.6)

per unit increase in the righthand side of the nonnegativity constraint $Z_k(t) \ge 0$. It was
shown in Wein [17] that the partial solution $Z(t)$ to this linear program yields the optimal
control process $Z$ given in equations (4.18)-(4.19). It was also shown there that the dual
of (5.3)-(5.6) can be expressed as

$\max_{\pi_1(t)} \; \frac{\hat{W}(t)}{\rho_2}\,\pi_1(t)$  (5.7)

subject to  $c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})\,\pi_1(t) \le \rho_2$  for $k = 1,\ldots,K$.  (5.8)

The reduced costs at time $t$ for the $K$ variables in the dual (5.7)-(5.8) are

$\rho_2 - c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})\,\pi_1^*(t)$  for $k = 1,\ldots,K$,  (5.9)

where $\pi_1^*(t)$ is the solution to (5.7)-(5.8). The higher the value of the $k$-th reduced cost in
(5.9), the more expensive it is to hold class $k$ customers in the queue. Furthermore, the
reduced cost for a class $k$ customer is zero when $Z_k(t) > 0$ in (4.18)-(4.19). Let us again
assume that $c_k = c$ for $k = 1,\ldots,K$ and consider the policy that gives highest priority
at each time $t$ to the customer class with the largest reduced cost. Then one obtains a
dynamic scheduling rule that ranks all $K$ classes by the index $\rho_2 M_{1k} - \rho_1 M_{2k}$ and, at each
station, serves the class with the smallest (respectively, largest) value of the index when
$\hat{W}(t) > 0$ (respectively, $\hat{W}(t) < 0$).
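The dual (5.7)-(5.8) has a single variable, so the reduced costs in (5.9) can be computed directly. The sketch below is one way to do this and is offered under several assumptions: at least one class has a positive index and at least one has a negative index (otherwise the dual would be unbounded), the matrix is the Section 7 example as we read it from (7.1), and the choice made when the imbalance is exactly zero is arbitrary. The function name is hypothetical.

```python
def reduced_costs(w_hat, M, rho, c):
    """Reduced costs (5.9) from the one-variable dual (5.7)-(5.8)."""
    rho1, rho2 = rho
    idx = [(rho2 * M[0][k] - rho1 * M[1][k]) / c[k] for k in range(len(c))]
    # Constraints idx[k] * pi <= rho2 give an upper bound on pi for each
    # positive index and a lower bound for each negative index.
    upper = min(rho2 / i for i in idx if i > 0)
    lower = max(rho2 / i for i in idx if i < 0)
    # Maximizing (w_hat / rho2) * pi pushes pi to the upper bound when
    # w_hat > 0 and to the lower bound when w_hat < 0 (arbitrary at zero).
    pi_star = upper if w_hat > 0 else lower
    return [rho2 - i * pi_star for i in idx]


M = [[4, 0, 10, 2, 2, 0],
     [1, 1, 13, 13, 7, 7]]
names = ["A1", "A2", "B1", "B2", "B3", "B4"]
rc_pos = reduced_costs(1.0, M, (0.9, 0.9), [1.0] * 6)
rc_neg = reduced_costs(-1.0, M, (0.9, 0.9), [1.0] * 6)
print(dict(zip(names, rc_pos)))
print(dict(zip(names, rc_neg)))
```

In this example the class with zero reduced cost is A1 when the imbalance is positive and B2 when it is negative, which agrees with the statement that the reduced cost vanishes for the class holding the queue.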

Simulation  results  (see  Section  7)  on  several  systems  have  indicated  that  both  the 
static  priority  rule  and  the  dynamic  rule  work  well  in  conjunction  with  the  workload 
regulating  input  rule  described  in  the  next  section.  The  static  rule  has  the  advantage  that 
it  is  easier  to  implement,  since  it  does  not  depend  on  any  global  state  information.  The 
dynamic  rule  has  the  advantage  that  it  has  a  natural  generalization  to  networks  with  more 
than two bottleneck stations.

Thus far in this section it has been assumed that $c_k = c$ for all $k = 1,\ldots,K$. If we
relax this assumption, it does not necessarily follow that class 1 is served at station 1 and
class 2 is served at station 2. When this is indeed still the case, then the same two priority
rules described earlier, now based on the index $c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})$, are the proposed
sequencing policies. If this is not the case, the equations (4.18)-(4.19) still suggest giving
class 1 customers lowest priority when $\hat{W}(t) > 0$ and giving class 2 customers lowest
priority when $\hat{W}(t) < 0$. However, additional simulation studies need to be performed
before making a more specific policy recommendation.

6.     The  Input  Rule 

In this section the input rule, which is based on the control processes $U$ and $\theta$, is
described. In the workload formulation of the limiting control problem, the controller
observes the two-dimensional Brownian motion process $B$, exerts the controls $U$ and $\theta$,
and obtains the controlled process $W$, which is the scaled workload process. The basic
system state equations (3.8) that govern the controlled process can be expressed as

$W_1(t) = B_1(t) + U_1(t) - v_1\theta(t)$  and  (6.1)

$W_2(t) = B_2(t) + U_2(t) - v_2\theta(t).$  (6.2)

Since these equations are linear and additive, the controls $U$ and $\theta$ act as "pushes" on
the Brownian motion $B$. Recall that the non-decreasing process $U_i$ represents the scaled
cumulative idleness process for station $i$. Also, $\theta$ is the scaled centered input process, and
the vector $v$ is proportional to the server utilization levels $\rho$.
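The cancellation behind the one-dimensional reduction of Section 4 can be written out from (6.1)-(6.2). The sketch below assumes that $\hat{W}$ denotes the imbalance $\rho_2 W_1 - \rho_1 W_2$, in parallel with the definition (4.8) of $\hat{B}$; that identification is our reading and is not stated in this form in the text.

```latex
\begin{align*}
\rho_2 W_1(t) - \rho_1 W_2(t)
  &= \rho_2 B_1(t) - \rho_1 B_2(t) + \rho_2 U_1(t) - \rho_1 U_2(t)
     - (\rho_2 v_1 - \rho_1 v_2)\,\theta(t) \\
  &= \hat{B}(t) + \rho_2 U_1(t) - \rho_1 U_2(t),
\end{align*}
% since \rho_i = v_i \lambda by (2.20) gives
% \rho_2 v_1 - \rho_1 v_2 = \lambda (v_2 v_1 - v_1 v_2) = 0.
```

Comparing with (4.17) suggests $\rho_2 U_1 = R$ and $\rho_1 U_2 = L$, which matches (4.15)-(4.16). Under this reading the input control $\theta$ moves $W$ without changing the imbalance, so only idleness can regulate the imbalance.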

Since $W = MZ$, the solution $Z$ in (4.18)-(4.19) implies that the workload process
$W$ resides on the boundary of a cone in the nonnegative orthant of $R^2$. From equations
(4.18)-(4.19), it can be seen that the control $U_1$ (respectively, $U_2$) is exerted only when the
scaled workload imbalance process $\hat{W}$ equals $a$ (respectively, $b$). Exerting the control $U_i$ is
interpreted as incurring server idleness at station $i$.

In terms of the two-dimensional workload process $W$, the interval endpoints $a$ and
$b$ correspond to reflecting barriers on the boundary of the cone, beyond which $W$ may
not enter. This situation is depicted in Figure 1, where $W$ must reside on the portion
of the cone boundary that is in boldface. In the optimal solution, the controls $U_1$ and
$U_2$ are only exerted when $W_2(t) = c_2^*$ and $W_1(t) = c_1^*$, respectively, where the scaled
threshold levels $c_1^*$ and $c_2^*$ can be calculated explicitly from the solution to the workload
formulation. Otherwise, only the input process $\theta$ is used to keep the controlled process $W$
on the boundary of the cone. Thus, the policy that emerges from the Brownian control
problem attempts to manipulate input in lieu of idling servers and keeps the workload
process $W$ on the boldface portion of the cone boundary in Figure 1. However, when the
process $W$ reaches the barrier at $c_1^*$ or the barrier at $c_2^*$ in Figure 1, then the controller
refuses to release any more customers into the system and is willing to incur server idleness.

FIGURE 1

In order to see exactly how the input is manipulated, recall that by equation (2.19)
and the balanced loading conditions, $v_1$ is approximately equal to $v_2$ and so the scaled
centered input process $\theta$ can move along a direction that is close to the 45 degree line. The
process $\theta$ was defined in equation (2.8) by

$\theta(t) = \frac{\lambda n t - N(nt)}{\sqrt{n}},$  (6.3)

where the process $N$ is the cumulative number of customers released into the system up
to time $t$. Thus, when $\theta$ moves in the negative 45 degree direction, input is being withheld
relative to the nominal input rate, and when $\theta$ moves in the positive 45 degree direction,
input is being increased relative to the nominal input rate. This is depicted in Figure 2,
where input is withheld whenever the workload process is in the cone and input is increased
whenever the workload process is in the shaded region.
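To make the scaled centered input process concrete, the following sketch evaluates it from a list of release times. It assumes the form $(\lambda n t - N(nt))/\sqrt{n}$, that is, the nominal number of releases minus the actual number, under the standard heavy traffic scaling; the function name and the sample release schedules are hypothetical.

```python
from math import sqrt


def scaled_centered_input(release_times, lam, n, t):
    """Evaluate (lam * n * t - N(n * t)) / sqrt(n) for a given release history.

    release_times: unscaled times at which customers were released (assumed input).
    """
    released = sum(1 for s in release_times if s <= n * t)
    return (lam * n * t - released) / sqrt(n)


lam, n = 0.5, 100
on_schedule = [2.0 * k for k in range(1, 51)]   # one release every 1/lam time units
held_back = [4.0 * k for k in range(1, 26)]     # half the nominal rate
print(scaled_centered_input(on_schedule, lam, n, 1.0))
print(scaled_centered_input(held_back, lam, n, 1.0))
```

Under this sign convention the value is zero when releases track the nominal rate and positive when input has been withheld.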

Notice that in the actual queueing system, it may be possible for the workload process
to reside outside of the cone. This is because the state space of $W$ is the cone $\{W =
MZ, Z \ge 0\}$, which contains the cone pictured in Figure 1. Its extremal rays are generated
by the two customer classes

$\arg\max_k \; \frac{M_{1k}}{M_{2k}}$  (6.4)

and

$\arg\min_k \; \frac{M_{1k}}{M_{2k}},$  (6.5)

which may not coincide with the rays in Figure 1, which are generated by the two classes
defined in (5.1)-(5.2).

The main goal of this section is to develop an effective input policy for the actual
queueing system that operationalizes the optimal solution obtained from the limiting control
problem. To this end, let us interpret the word "increase" in Figure 2 to simply mean
"release a customer into the system" and the word "withhold" to simply mean "cease input".
Then the naive rule that emerges from this interpretation is to release a customer into the
system when the workload process $W$ enters the shaded region in Figure 2. However, this
naive rule ignores a major difference that exists between the actual queueing system and
the idealized heavy traffic limit. This difference can be understood by making the following
observation. In the idealized Brownian setting, when the scaled workload process $W$ is on
the lower ray of the cone boundary and $W_1(t) < c_1^*$, then there are zero scaled customers
at station 2 and yet station 2 is not idle. Similarly, when $W$ is on the upper ray of the
cone and $W_2(t) < c_2^*$, then there are zero customers at station 1 and station 1 is not idle.
This apparent paradox is due to the rescaling that occurs when passing to the heavy traffic
limit. In the actual queueing system, there are enough customers at the particular station
to avoid idleness, but when looked at in the scaled space of the heavy traffic limit, these
customers vanish.

In order to adapt the naive control rule stated above to the actual queueing system, it
is necessary to build in a boundary layer of thickness $\epsilon$ on the inside of the cone boundary,
as shown in Figure 3. This boundary layer generates a new cone, which we call the $\epsilon$-cone,
that is strictly within the original cone. The input rule is still to release a customer into
the system whenever the workload process enters the shaded region, but now the shaded
region is enlarged by including the area between the two cones, as in Figure 3. This layer,
which is negligible in scaled space, prevents the process $W$ from straying very far from the
boldfaced portion of the original cone boundary, but allows the servers to be utilized the
requisite portion of the time. As $\epsilon$ increases, the servers will incur less idleness but the
queue lengths may grow as a result. In an actual queueing system, the appropriate setting
of $\epsilon$ will depend on the amount of variability in the queueing system and the amount of
time customers spend at non-bottleneck stations. In fact, one could use a layer of thickness
$\epsilon_1$ on the lower ray of the cone boundary, and a layer of thickness $\epsilon_2$ on the upper ray of
the cone boundary.


FIGURE  3 
Thus the suggested input rule is to release a customer whenever the workload process
enters the shaded region of Figure 3. This region can be calculated explicitly in terms of
the problem data and the parameters $\epsilon_1$ and $\epsilon_2$. The cone in Figure 1 is generated by the
rays

$W_2 - \frac{M_{21}}{M_{11}} W_1 = 0$  (6.6)

and

$W_1 - \frac{M_{12}}{M_{22}} W_2 = 0.$  (6.7)

Therefore the regions outside of the $\epsilon$-cone are

$W_2 - \frac{M_{21}}{M_{11}} W_1 < \epsilon_1$  (6.8)

and

$W_1 - \frac{M_{12}}{M_{22}} W_2 < \epsilon_2.$  (6.9)

From (2.37), (3.8) and (4.18)-(4.19), $c_1^*$ and $c_2^*$ can be solved for explicitly. The solution is

$c_1^* = \frac{M_{11}\, b}{\rho_2 M_{11} - \rho_1 M_{21}}$  (6.10)

and

$c_2^* = \frac{M_{22}\, a}{\rho_2 M_{12} - \rho_1 M_{22}},$  (6.11)

where $a$ and $b$ are the optimal interval endpoints from the solution to the Brownian control
problem.

Notice that $W$, $c_1^*$ and $c_2^*$ are all in scaled terms, and in order to find an appropriate
policy for the original queueing system, some unscaling needs to be done. By definition
(3.1) and the standard heavy traffic scaling described in Section 5 of Harrison [4], it can
be seen that

$W(t) = \frac{w(nt)}{\sqrt{n}},$  (6.12)

where $w$ is the unscaled workload process defined in the introduction by

$w(t) = MQ(t), \quad t \ge 0,$  (6.13)

and $Q$ is the actual queue length process. Define $w_i^* = \sqrt{n}\, c_i^*$ for $i = 1, 2$ to be the threshold
levels for the input policy. Then the suggested input rule is to release a customer into the
system at times $t$ such that either

$w_1(t) < w_1^*$  and  (6.14)

$w_2(t) - \frac{M_{21}}{M_{11}} w_1(t) < \epsilon_1,$  (6.15)

or

$w_2(t) < w_2^*$  and  (6.16)

$w_1(t) - \frac{M_{12}}{M_{22}} w_2(t) < \epsilon_2.$  (6.17)

Here, $\epsilon_1$ and $\epsilon_2$ are parameters that can be set in order to achieve a desired output rate.
As will be seen in the next section, the setting of these parameters is quite simple, at least
when there are no non-bottleneck stations in the queueing system.


7.     An  Example 

The  scheduling  rules  stated  in  Sections  5  and  6  will  be  illustrated  by  means  of  an 
example.  The  example  will  have  two  customer  types,  A  and  B,  and  there  is  a  50-50  product 
mix  that  is  specified,  so  that  customers  are  released  into  the  system  in  the  order  ABABAB... 
As  seen  in  Figure  4,  customer  type  A  has  two  stages  on  its  route  and  customer  type  B  has 
four stages. The six customer classes are designated (and ordered from $k = 1,\ldots,6$) by A1,
A2, B1, B2, B3 and B4, since each class corresponds to a type-stage pair.

The mean service times (in arbitrary time units) for each customer class are indicated
in Figure 4. For concreteness (since simulation results will be exhibited), all service times
are assumed to be exponential, although our results hold for any service time distributions
with finite mean and variance. Calculation of the $2 \times 6$ workload profile matrix $M$ yields

$M = \begin{pmatrix} 4 & 0 & 10 & 2 & 2 & 0 \\ 1 & 1 & 13 & 13 & 7 & 7 \end{pmatrix}.$  (7.1)

FIGURE 4

From (2.19), $v_1 = v_2 = 7$, so assuming $\rho_1 = \rho_2 = .9$ (and letting $n = 100$), we obtain
a target long-run average output rate $\lambda = .1286$ customers per unit time. As in the
original problem formulation, the objective is to minimize the long-run average cycle time
of customers subject to meeting the average output rate of .1286 customers per unit time.
From (7.1), one obtains the indices (since $\rho_1 = \rho_2$, they need not enter into the indices)

$M_{1k} - M_{2k} = (3, -1, -3, -11, -5, -7)$  for $k = 1,\ldots,6$.  (7.2)


Thus, the suggested static sequencing rule gives priorities (from highest to lowest) in the
order (B3, B1, A1) at station 1 and (A2, B4, B2) at station 2.
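The ranking above can be reproduced mechanically from the indices in (7.2). The sketch below is illustrative only: the station assignment of each class is inferred from the priority lists just stated (Figure 4 is not reproduced here), the matrix is our reading of (7.1), and the function name is hypothetical.

```python
def static_priorities(M, classes, station, rho=(1.0, 1.0)):
    """Rank classes at each station by the index rho2*M1k - rho1*M2k.

    Station 1 gives higher priority to smaller index values; station 2 gives
    higher priority to larger index values.
    """
    rho1, rho2 = rho
    index = {name: rho2 * M[0][k] - rho1 * M[1][k] for k, name in enumerate(classes)}
    at_1 = sorted((c for c in classes if station[c] == 1), key=index.get)
    at_2 = sorted((c for c in classes if station[c] == 2), key=index.get, reverse=True)
    return index, at_1, at_2


M = [[4, 0, 10, 2, 2, 0],
     [1, 1, 13, 13, 7, 7]]
classes = ["A1", "A2", "B1", "B2", "B3", "B4"]
station = {"A1": 1, "A2": 2, "B1": 1, "B2": 2, "B3": 1, "B4": 2}  # inferred assignment
index, order_1, order_2 = static_priorities(M, classes, station)
print(index)
print(order_1, order_2)
```

Because the two utilizations are equal in the example, the default weights of 1 give the same ordering as weights of .9.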

In order to find the suggested input policy, the interval endpoints $a$ and $b$ need to be
found. Using formulas (4.11)-(4.12) yields $a = -.4365\sigma^2$ and $b = .119\sigma^2$. Going through
the necessary calculations in (2.16), (3.2) and (4.1) gives $\sigma^2 = 10.93$. Using equations
(6.10)-(6.11), one can compute the values of $c_1^* = 1.93$ and $c_2^* = 6.26$. Upon unscaling, the
threshold levels are found to be $w_1^* = 19.3$ time units of work and $w_2^* = 62.6$ time units of
work.
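These numbers can be retraced step by step. The sketch below follows our reading of formulas (4.4)-(4.5), (4.7), (4.11)-(4.12) and (6.10)-(6.11), with $c_k = 1$ for all classes and with A1 and B2 playing the roles of classes 1 and 2 (the arg max and arg min of the index in (7.2)). The value of $\sigma^2$ is taken from the text rather than computed, and the variable names are our own.

```python
from math import sqrt

rho, n, sigma2 = 0.9, 100, 10.93   # utilization, scaling parameter, variance from the text
c = 1.0                            # common holding cost (assumed normalization)
M11, M21 = 4, 1                    # workload profile column of class 1 (A1)
M12, M22 = 2, 13                   # workload profile column of class 2 (B2)

h1 = c / (rho * M22 - rho * M12)   # (4.4)
h2 = c / (rho * M11 - rho * M21)   # (4.5)
eta = sqrt(n) * rho * (1 - rho)    # (4.7)

a = -h2 / (h1 + h2) * sigma2 / (2 * eta)   # (4.11), equal-utilization case
b = h1 / (h1 + h2) * sigma2 / (2 * eta)    # (4.12)

c1_star = M11 * b / (rho * M11 - rho * M21)   # (6.10)
c2_star = M22 * a / (rho * M12 - rho * M22)   # (6.11)
w1_star, w2_star = sqrt(n) * c1_star, sqrt(n) * c2_star   # unscaled thresholds

print(a / sigma2, b / sigma2)
print(c1_star, c2_star)
print(w1_star, w2_star)
```

Up to rounding, this reproduces the coefficients of $\sigma^2$ in $a$ and $b$, the scaled thresholds, and unscaled thresholds whose integer parts are 19 and 62.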

Since $w$ only takes on integer values in our example, it follows from equations (6.14)-
(6.17), with classes A1 and B2 playing the roles of classes 1 and 2, that the suggested
input rule is to release a customer into the system at times $t$ such that either

$w_1(t) \le 19$  and  (7.3)

$w_2(t) - \frac{1}{4} w_1(t) < \epsilon_1,$  (7.4)

or

$w_2(t) \le 62$  and  (7.5)

$w_1(t) - \frac{2}{13} w_2(t) < \epsilon_2.$  (7.6)
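The release test in (7.3)-(7.6) is simple enough to state as a predicate. The sketch below is a direct transcription under our reading of the slopes (1/4 from class A1 and 2/13 from class B2) and with boundary layer thicknesses of 1 as defaults; the function and argument names are hypothetical, and exact rational arithmetic is used only to avoid rounding at the boundary.

```python
from fractions import Fraction


def release_customer(w1, w2, eps1=1, eps2=1, w1_star=19, w2_star=62,
                     slope1=Fraction(1, 4), slope2=Fraction(2, 13)):
    """Return True when the unscaled workload (w1, w2) calls for a release."""
    near_lower_ray = w1 <= w1_star and w2 - slope1 * w1 < eps1   # (7.3)-(7.4)
    near_upper_ray = w2 <= w2_star and w1 - slope2 * w2 < eps2   # (7.5)-(7.6)
    return near_lower_ray or near_upper_ray


for state in [(0, 0), (16, 4), (2, 13), (8, 10), (30, 5), (20, 70)]:
    print(state, release_customer(*state))
```

An empty system and states close to either ray below its threshold trigger a release; states well inside the cone, or beyond a threshold along a ray, do not.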

Using the example network, a simulation study was undertaken to compare the
performance of the suggested scheduling rule against conventional input and sequencing rules.
Three input rules were tested: the suggested input rule (abbreviated by WR($\epsilon_1, \epsilon_2$) for
workload regulating input, where $\epsilon_1$ and $\epsilon_2$ are the boundary layer thicknesses used); closed
loop input (abbreviated by CL($N$), where $N$ is the total number of customers in the
network); and deterministic input, where the interarrival times are constant. For all input
rules, customers entered the system in the order ABABAB... Five sequencing rules were
compared: first-in first-out (FIFO); shortest expected processing time (SPT); shortest
expected remaining processing time (SRPT); the asymptotically optimal sequencing rule
(abbreviated by ST($M_1 - M_2$)); and the rule based on the dynamic reduced costs that
was described in Section 5 (abbreviated by DY($M_1 - M_2$)). Another common rule in the
scheduling literature is the least work next queue (LWNQ) rule. This policy gives priority
to the customer who is going next to the queue that has the least expected amount of
work in it. The LWNQ rule is not relevant here, since all customers at station 1 go next
to station 2, and all customers at station 2 go next to station 1 or exit the system.

The results of the simulation study are summarized in Table 1. Each row gives
statistics for a particular scheduling policy, which is specified by a particular input control rule
paired with a specific sequencing rule. The first two columns of Table 1 state the
scheduling policy. For each policy tested, ten independent runs were made, each consisting of 2000
customer completions. The third column gives the average throughput rate (in customers
per unit time) over the ten runs, along with a 95% confidence interval. The fourth column
of Table 1 contains the average cycle time of customers over the ten runs, along with a
95% confidence interval for this value. Rather than use the target average throughput rate
$\lambda = .1286$ customers per unit time, it was more convenient to choose the parameters of the
closed input policies and workload regulating input policies so as to achieve a throughput
rate of .127 customers per unit time. This average throughput rate corresponds to an
average server utilization of 88.9%. To allow for easy comparisons of the average cycle
times for the various policies, all simulation runs achieved this target output rate.
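The relation between throughput and utilization quoted here follows from $\rho_i = v_i\lambda$ in (2.20). A brief arithmetic check, assuming $v_1 = v_2 = 7$ time units of work per customer as in the example (the variable names are our own):

```python
v = 7.0                  # expected work per customer at each station
rho_target = 0.9         # utilization assumed in the example
achieved_rate = 0.127    # throughput rate used in the simulations

lam = rho_target / v                 # target output rate implied by rho = v * lambda
utilization = achieved_rate * v      # utilization implied by the achieved rate

print(round(lam, 4), round(utilization, 3))
```

So lowering the output rate from .1286 to .127 customers per unit time corresponds to lowering the utilization from 90% to roughly 88.9%.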

Each  simulation  run  had  no  initialization  period,  and  all  runs  began  with  an  empty 
system.  For  closed  loop  input  runs,  the  customers  arrived  according  to  deterministic  input 
(at  the  same  rate  as  the  corresponding  open  models)  until  the  network  had  reached  its 
population  limit,  and  then  closed  loop  input  was  used. 

Referring to the results in Table 1, it is seen that workload regulating input in
combination with either of the sequencing rules described in Section 5 easily outperformed all
other combinations of input and sequencing rules. The difference in performance between
the ST($M_1 - M_2$) and DY($M_1 - M_2$) rules was not statistically significant. They both
achieved nearly a 30% reduction in average cycle time, compared to the next best
scheduling rule, which was the closed loop input in combination with the ST($M_1 - M_2$) sequencing
rule. This sequencing rule, which was shown in Harrison and Wein [5] to maximize the
throughput rate of a two-station closed queueing network in heavy traffic, achieved a 30%
reduction in average cycle time compared to FIFO in the closed loop input case.
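The two percentage reductions quoted above can be recomputed from the mean cycle times reported in Table 1. The short check below uses only those point estimates and ignores the confidence intervals; the dictionary keys are our own labels.

```python
cycle_time = {
    ("CL", "FIFO"): 78.6,
    ("CL", "ST"): 54.9,
    ("WR", "ST"): 38.6,
    ("WR", "DY"): 38.9,
}


def percent_reduction(new, old):
    """Relative reduction of `new` with respect to `old`, in percent."""
    return 100.0 * (old - new) / old


wr_vs_cl = percent_reduction(cycle_time[("WR", "ST")], cycle_time[("CL", "ST")])
st_vs_fifo = percent_reduction(cycle_time[("CL", "ST")], cycle_time[("CL", "FIFO")])
print(round(wr_vs_cl, 1), round(st_vs_fifo, 1))
```

Both figures come out close to 30%, in line with the statements in the text.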

Since the workload regulating input rule was derived jointly with the
ST($M_1 - M_2$) and DY($M_1 - M_2$) rules, the input rule was not tested in combination
with the other three sequencing rules. Similarly, the ST($M_1 - M_2$) and DY($M_1 - M_2$)
rules were not tested in combination with input rules with which they were not derived.
INPUT            SEQUENCING       THROUGHPUT RATE    CYCLE TIME
RULE             RULE             (95% C.I.)         (95% C.I.)

DETERMINISTIC    FIFO             .127 (±.000)       92.0 (±4.1)
DETERMINISTIC    SPT              .127 (±.000)       73.8 (±3.1)
DETERMINISTIC    SRPT             .127 (±.000)       66.6 (±2.5)
CL(10)           FIFO             .127 (±.001)       78.6 (±0.7)
CL(10)           SPT              .127 (±.001)       78.2 (±0.8)
CL(8)            SRPT             .127 (±.001)       62.5 (±0.6)
CL(7)            ST(M1 - M2)      .127 (±.001)       54.9 (±0.4)
WR(1,1)          ST(M1 - M2)      .127 (±.001)       38.6 (±0.9)
WR(1,1)          DY(M1 - M2)      .127 (±.001)       38.9 (±0.9)

TABLE 1

The ease of implementation and accuracy of the workload regulating input rule were
even more impressive than its actual performance. Values of $\epsilon_1 = \epsilon_2 = 1$ achieved the target
output rate of .127 customers per unit time. Furthermore, in order to test the accuracy
of the threshold levels of 19 and 62 derived in (7.3) and (7.5) from the Brownian control
problem, a search over all values of threshold levels was made, while keeping $\epsilon_1 = \epsilon_2 = 1$.
The best threshold levels achieved only a 2.1% improvement over the derived values of 19
and 62. Thus, although the Brownian network model appears to be a rather crude model at
first glance, its results are surprisingly accurate, at least when there are no non-bottleneck
stations present in the network.